Large-scale manipulation of promoter DNA methylation reveals context-specific transcriptional responses and stability

Background Cytosine DNA methylation is widely described as a transcriptional repressive mark with the capacity to silence promoters. Epigenome engineering techniques enable direct testing of the effect of induced DNA methylation on endogenous promoters; however, the downstream effects have not yet been comprehensively assessed. Results Here, we simultaneously induce methylation at thousands of promoters in human cells using an engineered zinc finger-DNMT3A fusion protein, enabling us to test the effect of forced DNA methylation upon transcription, chromatin accessibility, histone modifications, and DNA methylation persistence after the removal of the fusion protein. We find that transcriptional responses to DNA methylation are highly context-specific, including lack of repression, as well as cases of increased gene expression, which appears to be driven by the eviction of methyl-sensitive transcriptional repressors. Furthermore, we find that some regulatory networks can override DNA methylation and that promoter methylation can cause alternative promoter usage. DNA methylation deposited at promoter and distal regulatory regions is rapidly erased after removal of the zinc finger-DNMT3A fusion protein, in a process combining passive and TET-mediated demethylation. Finally, we demonstrate that induced DNA methylation can exist simultaneously on promoter nucleosomes that possess the active histone modification H3K4me3, or DNA bound by the initiated form of RNA polymerase II. Conclusions These findings have important implications for epigenome engineering and demonstrate that the response of promoters to DNA methylation is more complex than previously appreciated. Supplementary Information The online version contains supplementary material available at 10.1186/s13059-022-02728-5.


Fig. S2. ZF-D3A ChIP-seq replicates show equiparable positional enrichments. A
Intersection between MACS2 ChIP-seq peaks across ZF ChIP-seq samples. Highlighted groups of overlaps are divided in three categories (unique to ZF-D3A-wt, unique to ZF-D3A-mut, or unique to ZF-empty) that are used in panel B. ZF-empty data comes from a MCF-7 line generated with the same zinc finger without any domain fusion obtained from a previous study [1]. Percentage of reads in peaks was obtained for each sample using the sample-specific MACS2 narrowpeak file. B Heatmap depicting raw ChIP-seq coverage on each ZF sample. Read coverage is shown as Counts Per Million (CPM), and input is not subtracted from ChIP data, but shown independently for each cell line. The 4 clusters are defined per overlaps shown in panel A, in which the first represent any peak that overlaps a peak in another condition (e.g. a peak in ZF-D3A-wt_rep1 peak that is also found in ZF-D3A-mut_rep2 will be shown in this cluster). Proportion of peaks overlapping the motif defined in Figure 2a was computed using the scanMotifGenomeWide tool in HOMER2.

Fig. S3. Nucleosome position dictates mCH deposition on closed and open chromatin. A
Heatmaps showing the ChIP-seq signal, mCH %, and ATAC signal on ZF-D3A-wt peaks centered on ZF-D3A motifs. Each ZF-D3A-wt peak can harbour more than one ZF-D3A motif. The motifs are classified according to the intersection with ATAC-seq peaks on the noDox stage. B Motif logo on the neighbouring regions flanking top methylated mCH sites on ZF-peaks.  S4. ZF-D3A-wt expression does not lead to a global reduction of transcriptional activity. A Clustering of the RNA-seq replicates by spearman correlation. B Percentage of reads mapping to ERCC spike-ins in each replicate for each stage. noDox and Dox do not show significant differences (Wilcoxon rank test p > 0.05). C Total amount of RNA measured with the Qubit RNA BR kit after RNA purification. For all samples but for DoxWD (7 days), 50k cells were used, spiked with 2 μL of ERCC Mix1 or Mix 2 (1:100 stock dilution) before RNA purification. For DoxWD, fewer cells could be obtained (13734,50000,42252,13734), but the ratio of spike-in was modified to maintain the same proportion as for the other samples. D Total read count for each sample based on reads mapping to genes and ERCC mRNAs. E MA-plots showing the distribution of transcriptional changes per gene in relationship to transcriptional level as per DEseq2 normalised values. Dashed lines depict a fold change of 2 (log2 transformed). DNMT3A is a positive control for overexpression, since the sequence of ZF-D3A is identical to the endogenous DNMT3A gene, and SOX2 is the originally targeted gene with the predicted best match for the ZF-D3A. F Comparison of the log fold changes of genes between Dox versus noDox for ZF-D3A-wt and Dox versus noDox for ZF-D3A-mut.

Fig. S5. ZF binding and DMR induction led to diverse transcriptional responses. A
Distribution of log fold transcriptional levels between Dox-mut and noDox-mut for differentially expressed genes divided by the presence of a ZF-D3A-mut peak on the promoter of the gene. B Scatter plot of the DMR methylation gain (as defined by weighted methylation difference on the DMR, unlike Fig. 3D), versus the fold change in mRNA abundance of promoter-DMR associated genes, between noDox and Dox. Point color indicates the gene differential expression significance: red indicates FDR <0.05, black indicates FDR >0.05 (DEseq2). Trend line (blue) was fitted using a Local Polynomial Regression. Depicted genes are not differentially expressed in the ZF-D3A-mut.   S8. Differential TF-footprinting at promoter-DMRs. Volcano plots showing the differential footprints assigned to Jaspar core-vertebrate TF motifs across the distinct promoter-DMRs. Right dots are statistically significant footprints calculated by TOBIAS that are depleted upon Dox induction, and blue dots indicate footprints that are gained. The bottom right plot shows TF footprints on all promoter-DMRs that show differences between Dox-mut and noDox, showing a different set of motifs than those depleted in Dox promoter-DMRs. This indicates that ZF binding is not responsible for the loss of these footprints. TSS. H3K27ac and H3K4me1 bigwigs mapped to hg19 were downloaded from the ENCODE database. C TF binding motif enrichments comparing primary TSS versus secondary TSS. Many of the ZF are enriched in CpGs, which reflects the higher CpG densities of primary TSS. D Overlap between alternative TSS usage in ZF-D3A-wt noDox vs Dox and ZF-D3A-mut noDox vs Dox comparisons. Of the primary TSS that are alternatively transcribed in ZF-D3A-mut, only 8 intersect with ZF-D3A peaks, while none of the secondary TSS do.

Fig. S10. Differential ATAC-seq peaks.
A MA-plots showing the differential binding across ATAC-peaks obtained using DEseq2. Only the noDox vs Dox comparison identifies differential peaks, whereas Dox-mut or DoxWD do not show differential peaks. B Genome browser snapshot of a Dox-specific ATAC peak gained on a methylated region. C Intersection between promoter-DMRs classified by transcriptional change of the associated gene upon Dox induction and ATACpeaks. Downregulated genes with promoter-DMRs overlap mostly with ATAC-peaks that lose accessibility, whereas upregulated genes with promoter-DMRs have the highest proportion of peaks that gain accessibility upon Dox induction. D Differential TF footprint calculated for the intersection of all ATAC-peaks in any stage. In red, footprints depleted upon Dox induction, in blue footprints enriched upon Dox induction. E Differential TF footprint on the same set of ATACseq peaks but using Dox-mut data. Many of the motifs that are depleted in Dox versus noDox, are enriched in Dox-mut (FOS, JUN).

Fig. S11. Differential methylation retention at promoters and distal regulatory elements.
Heatmap showing methylation and hydroxymethylation levels, chromatin accessibility, and TF binding on DMRs that retain or lose DMR upon Dox withdrawal, divided in A promoters and B distal regulatory elements. C Distribution of gene expression levels for genes with promoter-DMRs classified as retain-DMR or loss-DMR. D Expression of genes encoding the TET enzymes across all conditions, where each dot represents the normalised expression level (DEseq2) in each replicate. TET2 and TET3 are downregulated upon Dox induction. E Distribution of hydroxymethylcytosine (hmCG) values on promoters or distal regulatory elements. Those regions that do not overlap DMRs are displayed in grey, those that overlap Retain-DMRs are shown in green, and those that overlap Loss-DMRs shown in blue. An asterisk indicates a significant onesided t-test (p < 0.0001), two asterisks indicate a lower p-value. F Average % hmCG on DMRs and TSS of genes with DMRs. The boxplots centre lines are medians, box limits are quartiles 1 (Q1) and 3 (Q3), whiskers are 1.5 × interquartile range (IQR), and points are outliers.

Fig. S12. Prolonged Dox induction does not increase methylation gain.
Average CpG methylation levels on seven gene promoters obtained through bisulfite PCR amplicon sequencing in duplicates. Dox_100 indicates 100 ng/ul of Dox used per induction, whereas Dox_1000 indicates 1000 ng/ul. 3d and 6d indicate 3 days and 6 days of Dox induction, respectively. Colour coded as per cell line construct. Red indicates ZF-D3A-wt, blue indicates ZF-D3A-mut, and green is ZF-mCherry.  Table S1. Global WGBS stats for all samples. Global methylation levels at CpG sites on the human genome and the unmethylated lambda spike in control. Table S2. List of promoters with DMRs and transcriptional change. Coordinates of the DMRs (FDR < 0.01) intersecting promoters, with dmrseq statistics. Associated gene RNA-seq values as computed per DEseq2, for the noDox vs Dox comparison. Classified by category of transcriptional change and per intersection with ZF-D3A-wt. Those genes that are also differentially expressed in the ZF-D3A-mut noDox-mut vs Dox-mut are also shown. Table S3. Alternative TSS in noDox vs Dox comparison. SEASTAR alternative first exon coordinates with the differential Percent Splice Inclusion (PSI) values. Table S4. Alternative TSS in noDox-mut vs Dox-mut comparison. SEASTAR alternative first exon coordinates with the differential Percent Splice Inclusion (PSI) values.